Advanced Biological Data Analysis
17/1/2020 VM

Tom Wenseleers - oral (you receive the questions in an R file) (exact same as 6/01/2019)

# QUESTION 1. RESEARCHERS WERE INTERESTED TO TEST IF QUEEN PHEROMONES OF THE HONEYBEE, WHICH IN THAT SPECIES ARE EMITTED BY THE QUEEN TO STOP THE WORKERS FROM REPRODUCING, WOULD ALSO INHIBIT THE REPRODUCTION OF EITHER WORKERS OR QUEENS IN THE BUMBLEBEE. THEY HYPOTHESIZED THAT SUCH CROSS ACTIVITY MIGHT BE OBSERVED IN THE EVENT THAT THESE COMPOUNDS EXPLOIT CONSERVED PHYSIOLOGICAL PATHWAYS LINKED WITH THE REGULATION OF REPRODUCTION. TO TEST THIS HYPOTHESIS, THE RESEARCHERS EXPOSED BUMBLEBEE QUEENS AND BUMBLEBEE WORKER GROUPS (GROUPS OF 20 WORKERS EACH) TO EITHER A BLANK SOLVENT-ONLY CONTROL OR A SOLUTION OF HONEYBEE QUEEN PHEROMONES FOR A PERIOD OF 2 WEEKS. SUBSEQUENTLY, THEY DISSECTED THE QUEENS AND WORKERS AND MEASURED THE SIZE OF THE LARGEST OOCYTE IN THEIR OVARIES TO BE ABLE TO TEST FOR ANY EFFECTS ON OVARY DEVELOPMENT. FOR EACH OF THE WORKER GROUPS, THESE MEASUREMENTS WERE AVERAGED OVER ALL INDIVIDUALS. IN TERMS OF EXPERIMENTAL DESIGN, GENETIC BACKGROUND WAS CONTROLLED FOR BY DOING THE EXPERIMENT IN A PAIRED FASHION, WHEREBY THE WORKER GROUPS EXPOSED TO EACH REPLICATE CONTROL AND QUEEN PHEROMONE TREATMENT WERE DERIVED FROM THE SAME SOURCE COLONY AND SIMILARLY, THE QUEENS EXPOSED TO EACH REPLICATE CONTROL AND QUEEN PHEROMONE TREATMENT WERE TAKEN TO BE SISTERS OF EACH OTHER (COLONY OR SIBGROUP IS ENCODED AS VARIABLE "ID" IN THE DATASET). IN YOUR ANALYSIS, TAKE INTO ACCOUNT THIS NON-INDEPENDENCE THROUGH THE INCLUSION OF A RANDOM EFFECT TERM, AND USE A MODEL WITH THE APPROPRIATE ERROR DISTRIBUTION. AIM: TEST WHETHER HONEYBEE QUEEN PHEROMONES INHIBIT OVARY DEVELOPMENT AND WHETHER THIS EFFECT IS CASTE DEPENDENT. (cf. dataset "data.csv").
# SPECIFIC QUESTIONS:
# 1A. Display your data using "spaghetti plots", i.e. plot oocyte size in function of treatment, using two different panels for caste, and connect points that are measured from individuals from the same colony (for workers) or sib-group (for queens). Make this plot both using lattice's xyplot and ggplot2.
# 1B. Fit a model of oocyte size (SIZE_OOCYTE) in function of TREATMENT and CASTE, either considering a possible interaction effect between both or not, taking into account possible random effects and use a model with the appropriate error distribution. Decide which model is best based on the AIC criterion. What is the name of the type of model you fitted? (in this case a linear mixed effects model was ok, since the distribution of errors was normal: lme() or lmer())
# 1C. Make effect plots of the effects in your model and explain what these imply.
# 1D. Carry out the relevant tests for the significance of the different effects. What do these tell you? Also carry out Tukey posthoc tests to test the effect of treatment for each of the two castes. What would the conclusion have been if you would have ignored the dependency in your data (variable ID, i.e. colony or sibship)? Would the effect of TREATMENT have been more or less significant then and why? (just do a normal lm here)
# 1E. Test whether the residuals of your model conform to your assumed error distribution.

Hans Jacquemyn (in a Word document)

The marsh orchid (Dactylorhiza sphagnicola) and the heath spotted orchid (Dactylorhiza maculata) (Fig. 1) are two orchid species that are able to hybridize when they co-occur. In the Belgian Ardennes, both allopatric and sympatric populations of both orchids can be found and there are some indications that hybridization occurs in sympatric populations. In order to gain better insight into the hybridization process, the morphology of a large number of plants was investigated in a large sympatric population where both species co-occurred. In total 28 morphological traits were measured (Table 1).
The same characteristics were also measured for a large number of plants in allopatric populations (one for each species). The data have been summarized in two datasets. The dataset sympatric.xlsx contains all data for the sympatric population, the dataset allopatric.xlsx contains the data for the allopatric populations.

1) Investigate whether the two species can be unequivocally distinguished based on the measured characteristics.
2) Which traits are most important to distinguish the two species?
3) Are there indications that hybridization has occurred in the sympatric population? How can you deduce this from your analysis?

Illustrate your answer with the appropriate figures and explain which analysis you have used and why.

17/01/2020 NM

Tom Wenseleers - oral (you receive the questions in an R file)

# THE PROVIDED DATA FILE CONTAINS DETAILED DATA ON THE SURVIVAL OF THE PASSENGERS
# OF THE TITANIC (Survived=0/1 for passengers that died or not) AS A FUNCTION OF,
# AMONGST OTHERS, THEIR AGE ("Age"), THEIR SEX ("Sex") AND THE PRICE THEY PAID
# FOR THEIR TICKET ("Priceperticket").
# A. FIT A MODEL OF PASSENGER SURVIVAL IN FUNCTION OF Age, Sex & Priceperticket WITH
# THE APPROPRIATE ERROR STRUCTURE. START WITH A MODEL THAT TAKES INTO ACCOUNT ALL POSSIBLE
# HIGHER-ORDER INTERACTION EFFECTS AND THEN USE STEPWISE BACKWARD MODEL REDUCTION AND
# CALCULATE THE BEST MODEL BASED ON THE BAYESIAN INFORMATION CRITERION BIC.
# TEST FOR THE POSSIBLE PRESENCE OF OUTLIERS & INFLUENTIAL OBSERVATIONS AND REMOVE
# THOSE IF NECESSARY.
# B. MAKE EFFECT PLOTS OF ALL THE PREDICTORS IN YOUR BEST MODEL (ON THE LINK SCALE
# BUT USING Y AXIS RESPONSE PLOT LABELS, using type="rescale")
# AND INTERPRET THE RESULTS.
# C. CALCULATE THE LOG(ODDS RATIO) TO SURVIVE FOR MALE PASSENGERS OF AVERAGE AGE THAT
# PAID FOR THE MOST EXPENSIVE TICKET VS THOSE THAT PAID FOR THE CHEAPEST ONE.
# DO THE SAME FOR FEMALE PASSENGERS.
DID THE LOG ODDS OF THEM SURVIVING GO UP
# IN PROPORTION TO THE PRICE THEY PAID FOR THEIR TICKET, OR WAS THERE AN UNFAIR
# ADVANTAGE FOR THOSE THAT PAID FOR THE MOST EXPENSIVE ONE?
#
# PASTE ALL R CODE + ALL TABULAR & GRAPHICAL OUTPUT & INTERPRETATION IN A WORD DOCUMENT
# AND DO THE SAME FOR THE PART OF HANS JACQUEMYN AND HAND IT OVER TO THE ASSISTANT
# SAVED AS LASTNAME_FIRSTNAME.DOCX & LASTNAME_FIRSTNAME.R (ZIP THESE TWO FILES
# AS LASTNAME_FIRSTNAME_EXAM.ZIP)

Given:

library(effects)
library(afex)
library(car)
library(MASS)
library(emmeans)
setwd("~/Dropbox/courses/stats_courses/stats_2019_2020/Advanced Biological Data Analysis 2019/exams 2019/17 jan 2020 14h")
data=read.csv("titanic_data_full.csv")
data$Priceperticket = data$Price/data$TicketCount
head(data)

Hans Jacquemyn (in a Word document)

Orchids rely on mycorrhizal fungi to complete their life cycle. Recent research has shown that these fungi can be quite diverse and belong to different genera. Fungal communities are also likely to vary between orchid populations. Little, however, is known about the factors determining variation in fungal communities and whether this variation affects the population dynamics of orchids. It can be expected that differences in soil conditions have a significant impact on mycorrhizal communities and therefore on the population dynamics of the orchids. To test this hypothesis, fungal communities were determined in a large number of populations of the terrestrial orchid Neottia ovata (Fig. 1). For each population a series of soil characteristics was measured and the change in population size was assessed by comparing the population size in 2003 with that in 2013. Data were collected in three separate plots in each population.
The data have been summarized in two datasets, one that gives for each population the abundance (number of sequences) of all detected fungi (Neottia_fungi.xlsx), and one that gives an overview of the soil characteristics (Neottia_soil.xlsx). The soil characteristics that were measured are soil moisture content (moist), organic matter (OM), phosphate concentration (P), nitrate concentration (NO3) and ammonium concentration (NH4). The last file also tells you whether a population has increased (1) or decreased in size (0) between 2003 and 2013.

1) Assess whether fungal communities vary among populations.
2) Investigate whether variation in fungal communities can be related to soil conditions. Which soil variables have the largest effect on fungal communities?
3) Investigate whether the observed changes in population size can be related to variation in mycorrhizal communities.

06/01/2019

Tom Wenseleers - oral (you receive the questions in an R file)

# QUESTION 1. RESEARCHERS WERE INTERESTED TO TEST IF QUEEN PHEROMONES OF THE HONEYBEE, WHICH IN THAT SPECIES ARE EMITTED BY THE QUEEN TO STOP THE WORKERS FROM REPRODUCING, WOULD ALSO INHIBIT THE REPRODUCTION OF EITHER WORKERS OR QUEENS IN THE BUMBLEBEE. THEY HYPOTHESIZED THAT SUCH CROSS ACTIVITY MIGHT BE OBSERVED IN THE EVENT THAT THESE COMPOUNDS EXPLOIT CONSERVED PHYSIOLOGICAL PATHWAYS LINKED WITH THE REGULATION OF REPRODUCTION. TO TEST THIS HYPOTHESIS, THE RESEARCHERS EXPOSED BUMBLEBEE QUEENS AND BUMBLEBEE WORKER GROUPS (GROUPS OF 20 WORKERS EACH) TO EITHER A BLANK SOLVENT-ONLY CONTROL OR A SOLUTION OF HONEYBEE QUEEN PHEROMONES FOR A PERIOD OF 2 WEEKS. SUBSEQUENTLY, THEY DISSECTED THE QUEENS AND WORKERS AND MEASURED THE SIZE OF THE LARGEST OOCYTE IN THEIR OVARIES TO BE ABLE TO TEST FOR ANY EFFECTS ON OVARY DEVELOPMENT. FOR EACH OF THE WORKER GROUPS, THESE MEASUREMENTS WERE AVERAGED OVER ALL INDIVIDUALS.
IN TERMS OF EXPERIMENTAL DESIGN, GENETIC BACKGROUND WAS CONTROLLED FOR BY DOING THE EXPERIMENT IN A PAIRED FASHION, WHEREBY THE WORKER GROUPS EXPOSED TO EACH REPLICATE CONTROL AND QUEEN PHEROMONE TREATMENT WERE DERIVED FROM THE SAME SOURCE COLONY AND SIMILARLY, THE QUEENS EXPOSED TO EACH REPLICATE CONTROL AND QUEEN PHEROMONE TREATMENT WERE TAKEN TO BE SISTERS OF EACH OTHER (COLONY OR SIBGROUP IS ENCODED AS VARIABLE "ID" IN THE DATASET). IN YOUR ANALYSIS, TAKE INTO ACCOUNT THIS NON-INDEPENDENCE THROUGH THE INCLUSION OF A RANDOM EFFECT TERM, AND USE A MODEL WITH THE APPROPRIATE ERROR DISTRIBUTION. AIM: TEST WHETHER HONEYBEE QUEEN PHEROMONES INHIBIT OVARY DEVELOPMENT AND WHETHER THIS EFFECT IS CASTE DEPENDENT. (cf. dataset "data.csv").
# SPECIFIC QUESTIONS:
# 1A. Display your data using "spaghetti plots", i.e. plot oocyte size in function of treatment, using two different panels for caste, and connect points that are measured from individuals from the same colony (for workers) or sib-group (for queens). Make this plot both using lattice's xyplot and ggplot2.
# 1B. Fit a model of oocyte size (SIZE_OOCYTE) in function of TREATMENT and CASTE, either considering a possible interaction effect between both or not, taking into account possible random effects and use a model with the appropriate error distribution. Decide which model is best based on the AIC criterion. What is the name of the type of model you fitted? (in this case a linear mixed effects model was ok, since the distribution of errors was normal: lme() or lmer())
# 1C. Make effect plots of the effects in your model and explain what these imply.
# 1D. Carry out the relevant tests for the significance of the different effects. What do these tell you? Also carry out Tukey posthoc tests to test the effect of treatment for each of the two castes. What would the conclusion have been if you would have ignored the dependency in your data (variable ID, i.e. colony or sibship)?
Would the effect of TREATMENT have been more or less significant then and why? (just do a normal lm here)
# 1E. Test whether the residuals of your model conform to your assumed error distribution. (he wanted to see a histogram of the residuals)

Hans Jacquemyn (in a Word document)

Data set: Centaurium.xlsx

Evolutionary theory predicts that coexistence of two closely related plant species can have a major impact on floral morphology and plant mating systems. To test this prediction, floral morphology of a large number of individuals from both allopatric and sympatric populations of Common centaury (Centaurium erythraea) and Seaside centaury (C. littorale) (Fig. 1) was investigated along the Belgian coast. For each individual seven floral traits were measured (Table 1).

Table 1. Floral traits measured in allopatric and sympatric populations of Centaurium erythraea and C. littorale along the Belgian coast.

1) Investigate whether floral morphology differs significantly between flowers of Centaurium erythraea and C. littorale.
2) Illustrate graphically which variables best explain variation in floral morphology between both species and indicate which variable most determines the difference between the two species.
3) Can you notice a difference in floral morphology between allopatric and sympatric populations? Is this difference determined by the identity of the species? If yes, explain how and illustrate with the appropriate graph.
4) Based on your results, can you conclude that allopatric populations can be as easily discerned from sympatric populations for C. erythraea as for C. littorale? And which floral traits are best suited to discern allopatric from sympatric populations?
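
For the Wenseleers bumblebee question of this exam (parts 1B, 1D and 1E), the following is a minimal sketch of one possible workflow, not an official solution. Since data.csv is not reproduced here, the data are simulated: the object `sim`, its effect sizes and the number of colonies are all made-up assumptions. Only the variable names SIZE_OOCYTE, TREATMENT, CASTE and ID and the use of lme() and lm() come from the question text. The sketch uses nlme (which ships with standard R installations) rather than lmer(); the Tukey posthoc tests and effect plots are not covered.

```r
library(nlme)  # provides lme(); part of the recommended packages shipped with R

set.seed(1)

# Simulated stand-in for data.csv (hypothetical values, NOT the exam data):
# 10 colonies (workers) and 10 sib-groups (queens), each measured once under
# the control and once under the pheromone treatment (paired design).
n_id <- 10
sim <- expand.grid(TREATMENT = c("control", "pheromone"),
                   ID_within = seq_len(n_id),
                   CASTE     = c("queen", "worker"))
sim$ID <- factor(paste(sim$CASTE, sim$ID_within, sep = "_"))

# Random intercept per colony/sib-group plus made-up fixed effects
id_effect <- rnorm(nlevels(sim$ID), sd = 0.4)
sim$SIZE_OOCYTE <- 3 +
  ifelse(sim$CASTE == "worker", -1, 0) +
  ifelse(sim$TREATMENT == "pheromone", -0.5, 0) +
  id_effect[as.integer(sim$ID)] +
  rnorm(nrow(sim), sd = 0.2)

# 1B: linear mixed models with a random intercept for ID.
# Fitted with ML (not REML) so that AIC can be compared between models
# that differ in their fixed effects.
fit_add <- lme(SIZE_OOCYTE ~ TREATMENT + CASTE, random = ~ 1 | ID,
               data = sim, method = "ML")
fit_int <- lme(SIZE_OOCYTE ~ TREATMENT * CASTE, random = ~ 1 | ID,
               data = sim, method = "ML")
aic_table <- AIC(fit_add, fit_int)
print(aic_table)

# 1D: the same fixed effects with the pairing (ID) ignored
fit_lm <- lm(SIZE_OOCYTE ~ TREATMENT + CASTE, data = sim)
se_lme <- summary(fit_add)$tTable["TREATMENTpheromone", "Std.Error"]
se_lm  <- summary(fit_lm)$coefficients["TREATMENTpheromone", "Std. Error"]
print(c(se_lme = se_lme, se_lm = se_lm))

# 1E: check the residuals against the assumed normal distribution
res <- as.numeric(residuals(fit_add, type = "normalized"))
if (interactive()) {
  hist(res, main = "Residuals of fit_add", xlab = "Normalized residual")
}
print(shapiro.test(res))
```

In a paired design like this one, the treatment contrast is estimated within each colony or sib-group, so the mixed model removes between-ID variation from the treatment comparison. Ignoring ID in a plain lm leaves that variation in the residual error, which is why the treatment effect would be expected to look less significant there.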